Zero-error feedback capacity via dynamic programming

Lei Zhao and Haim Permuter 



Abstract 

In this paper, we study the zero-error capacity for finite state channels with feedback when channel state 
information is known to both the transmitter and the receiver. We prove that the zero-error capacity in this case can 
be obtained through the solution of a dynamic programming problem. Each iteration of the dynamic programming 
provides lower and upper bounds on the zero-error capacity, and in the limit, the lower bound coincides with the 
zero-error feedback capacity. Furthermore, a sufficient condition for solving the dynamic programming problem is 
provided through a fixed-point equation. Analytical solutions for several examples are provided. 

Index Terms 

Bellman equations, competitive Markov decision processes, dynamic programming, feedback capacity, fixed-point 
equation, infinite-horizon average reward, stochastic games, zero-error capacity. 

I. Introduction 

In 1956, Shannon [1] introduced the concept of zero-error communication, which requires that the probability
of error in decoding any message transmitted through the channel be zero. Although the zero-error capacity
for general channels remains an unsolved problem (see [2] for a comprehensive survey of zero-error information 
theory), Shannon [1] showed that for discrete memoryless channels (DMC) with feedback the zero-error capacity 
is either zero (if any two inputs can generate a common output) or equal to: 

$$ C_0 = \max_{P_X} \log_2 \left[ \max_{y} \sum_{x \in G(y)} P_X(x) \right]^{-1}, \tag{1} $$


where $P_X$ is the channel input distribution, $y$ is an output realization of the channel, and $G(y)$ is the set of inputs
that have a positive probability of generating the output $y$, i.e., $G(y) = \{x : P_{Y|X}(y|x) > 0\}$. The achievability
proof of (1) is based on a deterministic scheme rather than on a random coding scheme, as used for showing the
achievability of regular capacity.
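
As a small illustration of (1), the sketch below evaluates the bracketed term for one candidate input distribution and checks Shannon's adjacency condition. The four-input channel, the dictionary `G`, and the function names are hypothetical and chosen only for this example. Evaluating (1) at a single candidate only gives a lower bound on the maximum; for this toy channel the four sums add up to 2, so the largest is at least 1/2 and the uniform distribution is in fact optimal.

```python
from math import log2

def shannon_bracket_rate(p_x, g):
    """Evaluate log2 of [max_y sum_{x in G(y)} P_X(x)]^{-1} for one candidate P_X."""
    worst = max(sum(p_x[x] for x in inputs) for inputs in g.values())
    return -log2(worst)

def all_pairs_adjacent(num_inputs, g):
    """True if every two inputs share some output, in which case C_0 = 0."""
    for a in range(num_inputs):
        for b in range(a + 1, num_inputs):
            if not any(a in inputs and b in inputs for inputs in g.values()):
                return False
    return True

# Hypothetical channel: output y can be produced by inputs y and (y + 1) mod 4.
G = {y: {y, (y + 1) % 4} for y in range(4)}
uniform = [0.25] * 4

zero_capacity = all_pairs_adjacent(4, G)   # inputs 0 and 2 share no output
rate = shannon_bracket_rate(uniform, G)    # bits per channel use
```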

In this paper, we study the zero-error feedback capacity for finite state channels (FSC), a family of channels with
memory. We assume that channel state information (CSI) is available both to the transmitter and to
the receiver. In this case, we solve for the zero-error capacity, which depends only on the topological properties of the
channel. A similar setup has been used by Chen and Berger [3], who solved the regular channel capacity by finding
the optimal stationary and nonstationary input processes that maximize the long-term directed mutual information.
In [4] and [5], the zero-error capacity of the chemical channel with feedback was derived. The chemical channel is 
a special case of FSCs. With feedback, the transmitter knows the state of the chemical channel while the receiver 
does not, which is different from our setup. Other related work can be found in [6], which addresses the zero-error 
capacity for compound channels. 

The remainder of the paper is organized as follows. In Section II we introduce the channel model and the dynamic
programming problem formulation. In Section III we use a finite-horizon dynamic programming (DP) to provide
a condition for the channel to have zero zero-error capacity. In Section IV we define an infinite-horizon average
reward DP problem and link its solution to the zero-error capacity. In Sections V and VI we prove the converse
and direct parts, respectively. In Section VII we explain how to evaluate the infinite-horizon average reward DP;
in particular, we provide a sequence of lower and upper bounds that are easy to compute and prove the Bellman
equation theorem for the particular DP, namely, a fixed-point equation that is a sufficient condition for verifying the
optimality of a solution. In Section VIII we evaluate and then find analytically the zero-error feedback capacity of
several examples.

II. Channel Model and Preliminaries 

We use the calligraphic letter $\mathcal{X}$ to denote the alphabet and $|\mathcal{X}|$ to denote the cardinality of the alphabet. Subscripts
and superscripts are used to denote vectors in the following way: $x^j = (x_1, \dots, x_j)$ and $x_i^j = (x_i, \dots, x_j)$ for $i \le j$.
Next we introduce the channel model and the DP formulation. 

A. Channel model and zero-error capacity definition 

An FSC [7, ch. 4] is a channel that, at each time index, has a state which belongs to a finite set $\mathcal{S}$ and has the
property that, given the current input and state, the output and the next state are independent of the past inputs,
outputs and states, i.e.,

$$ P(y_t, s_{t+1} | x^t, s^t) = p(y_t, s_{t+1} | x_t, s_t). \tag{2} $$

For simplicity, we assume that the channel has the same input alphabet $\mathcal{X}$ and the same output alphabet $\mathcal{Y}$ for all
states. The alphabets $\mathcal{X}$ and $\mathcal{Y}$ are both finite. Without loss of generality, we can assume that $\mathcal{X} = \{1, 2, \dots, |\mathcal{X}|\}$.
We consider the communication setting shown in Fig. 1, where the state of the channel is known to the encoder
and to the decoder.

An $(M, n)$ zero-error feedback code of length $n$ is defined as a sequence of encoding mappings $x_t(m, y^{t-1}, s^t)$
and a decoding function $\hat{m} = g(y^n, s^{n+1})$, where a message $m$ is selected from a set $\{1, \dots, M\}$. The probability of
error is required to be zero, i.e., $\Pr\{g(Y^n, S^{n+1}) \ne m \mid \text{message } m \text{ is sent}\} = 0$ for all messages $m \in \{1, 2, \dots, M\}$.
We emphasize that the size of the message set $M$ does not depend on the initial state of the channel; hence, the
probability of decoding error needs to be zero for any initial state.

Definition 1 A rate $R$ is achievable if there exists an $(M, n)$ zero-error feedback code such that $R \le \frac{\log_2 M}{n}$.

Fig. 1. Communication model: a finite state channel (FSC) with feedback and state information at the decoder and encoder.



Definition 2 The operational zero-error capacity of an FSC is defined as the supremum of all achievable rates.

Throughout this paper we use the following alternative and equivalent definition of the operational zero-error 
capacity. 

Definition 3 Let $M(n, s)$ be the maximum number of messages that can be transmitted with zero error in $n$
transmissions when the initial state of the channel is $s \in \mathcal{S}$. Define

$$ a_n = \min_{s \in \mathcal{S}} \log_2 M(n, s). \tag{3} $$

The operational zero-error capacity is given by:

$$ C_0 = \lim_{n \to \infty} \frac{a_n}{n} = \lim_{n \to \infty} \frac{\min_{s \in \mathcal{S}} \log_2 M(n, s)}{n}, \tag{4} $$

where the limit is shown to exist.

Since the transmitter knows the state, the sequence $\{a_n\}$ is super-additive, i.e., $a_{n+m} \ge a_n + a_m$, and $\frac{a_n}{n} \le |\mathcal{X}|$.
By Fekete's lemma [8, Ch. 2.6], $\lim_{n \to \infty} \frac{a_n}{n}$ exists and is equal to $\sup_n \frac{a_n}{n}$. Note that $R \le \lim_{n \to \infty} \frac{a_n}{n}$ holds for any achievable
rate $R$, and any rate less than $\lim_{n \to \infty} \frac{a_n}{n}$ is achievable; both are simple consequences of Definition 1. Thus, $\lim_{n \to \infty} \frac{a_n}{n}$
defines the zero-error capacity.



B. Dynamic programming 

For the standard Markov decision process (MDP), we have the dynamic programming equation [9], [10]: 

$$ U_n(s) = \max_{a \in \mathcal{A}} \left[ r(s, a) + \sum_{s'} P(s'|s, a)\, U_{n-1}(s') \right], \tag{5} $$

where $r(s, a)$ is the reward, given that we are at state $s \in \mathcal{S}$ and perform action $a \in \mathcal{A}$. The term $U_n(s)$ is the
total reward after $n$ steps (a.k.a. the "reward-to-go" in $n$ steps) when we start at state $s$. The conditional distribution
$P(s'|s, a)$ is the probability of the next state $s' \in \mathcal{S}$, given the current state $s \in \mathcal{S}$ and action $a \in \mathcal{A}$.
The dynamic programming equation that is associated in this paper with the zero-error capacity has the form

$$ U_n(s) = \max_{a \in \mathcal{A}(s)} \min_{s' \in \mathcal{S}(a)} \left\{ r(s, a, s') + U_{n-1}(s') \right\}, \tag{6} $$

where $r(s, a, s')$ is the reward, given the current state $s$, the action $a$ and the next state $s'$. The reward may be any
real number, including $\pm\infty$. The value $U_n(s)$ is defined as before, i.e., the total reward in $n$ steps when starting at
state $s$.

The DP equation in (6) may be viewed as a stochastic game [11], also known as a competitive MDP [12], in which
there are two asymmetric players. Player 1, the leader, takes an action $a \in \mathcal{A}(s)$, which may depend on the current
state, and Player 2, the follower, determines the next state $s' \in \mathcal{S}$. Player 2 sees the state of the game $s$ and the
action of Player 1. In the zero-error capacity problem, Player 1 would be the user who designs the code to maximize
the transmitted rate, and Player 2 would be Nature, which chooses the next state to minimize the transmitted rate.

III. A Sufficient and Necessary Condition for $C_0 = 0$

Shannon [1] showed that for a DMC, which is an FSC with only one state, if any two input letters have at least 
one common output, it is impossible to distinguish between two messages with zero-error. Using finite-horizon 
dynamic programming, we derive in this section a sufficient and necessary condition for an FSC to have Co = 0, 
i.e., the zero-error capacity is zero. 

Definition 4 Two input letters $x_1$ and $x_2$ are called adjacent at state $s$ if there exists an output letter $y$ and a state
$s'$ such that $p(y, s'|x_1, s) > 0$ and $p(y, s'|x_2, s) > 0$.

Definition 5 A state s is positive if there exist two input letters that are not adjacent at state s. 

The intuition behind the result in this section is that if the channel undergoes only non-positive states during the 
transmission, we cannot distinguish between two messages based on the output sequence and the channel state 
sequence, since they could result from either message. 

To determine whether Co = 0, we form the following dynamic programming equation, 

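$$ V_n(s) = r(s) + \max_{x \in \mathcal{X}} \min_{s' \in \mathcal{S}(s, x)} V_{n-1}(s'), \tag{7} $$

where $V_0(s) = 0$, $\forall s \in \mathcal{S}$, $\mathcal{S}(s, x) = \{s' : p(y, s'|x, s) > 0 \text{ for some } y \in \mathcal{Y}\}$, and the reward is $r(s) = 1$ if state $s$ is
positive, while $r(s) = 0$ if state $s$ is not positive.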

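Lemma 1 (monotonicity of $V_n(s)$.) The total reward $V_n(s)$ is non-negative and non-decreasing in $n$, i.e.,

$$ 0 \le V_n(s) \le V_{n+1}(s), \quad \forall n = 1, 2, 3, \dots \text{ and } s \in \mathcal{S}. \tag{8} $$

Proof: Let $\tilde{V}_n(s) = r(s) + \max_{x \in \mathcal{X}} \min_{s' \in \mathcal{S}(s, x)} \tilde{V}_{n-1}(s')$ and $\tilde{V}_0(s) \ge V_0(s)$, $\forall s \in \mathcal{S}$. Then, by induction,
we have $\tilde{V}_n(s) \ge V_n(s)$, $\forall n = 1, 2, 3, \dots$ and $s \in \mathcal{S}$. Since $r(s) \ge 0$, $\forall s \in \mathcal{S}$, we have $V_1(s) \ge 0$. Let us define
$\tilde{V}_0(s) = V_1(s)$. Since $\tilde{V}_0(s) \ge V_0(s)$, we obtain that $\tilde{V}_n(s) \ge V_n(s)$, which means that $V_{n+1}(s) \ge V_n(s) \ge 0$. ■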
TABLE I

Interpretation of the DP given in (7), which corresponds to determining whether $C_0 > 0$.

| The DP given in (7) | Interpretation of the DP |
|---|---|
| state $s$ of the DP | state $s$ of the channel |
| reward $r(s) = 1$ | state $s$ is positive; at least one bit can be transmitted error-free |
| reward $r(s) = 0$ | state $s$ is non-positive; no bits can be transmitted error-free |
| Player 1 takes action $x$ in order to maximize the reward of the DP | encoder chooses input $x$ in order to maximize the number of positive states visited |
| Player 2 chooses next state in order to minimize the reward of the DP | Nature chooses next state and output to minimize the number of messages transmitted |
| $V_n(s)$: total reward in $n$ rounds, starting the game from state $s$ | number of positive states visited in $n$ usages of the channel starting at state $s$ |


This DP can be viewed as a two-person game, where $V_n(s)$ is the game result after $n$ steps starting with initial
state $s$. Player 1 chooses the input letter $x$, and Player 2 chooses the next state $s'$. Both players know the current
state $s$, and the reward of the game is a function of the current state only, i.e., $r(s)$. Player 1 makes the first
play, and the two players alternate plays thereafter. The goal of Player 1 is to maximize the number of times
the channel visits a positive state, and Player 2 tries to minimize it. The interpretation of the DP as a stochastic
game between the user and Nature is summarized in Table I.

The following lemma states that if the total reward of the stochastic game is zero after n rounds with initial state 
s, i.e., V n (s) = 0, then only one message can be sent error-free through n uses of the channel with initial state s. 

Lemma 2 $V_n(s) = 0$ implies $M(n, s) = 1$ and $V_n(s) > 0$ implies $M(n, s) > 1$.

Proof: First, we observe that so as to send two or more messages in n uses of the channel, a positive state 
should be visited with probability one. Once a positive state is visited, we can use two inputs that are not adjacent 
to transmit without error one bit (two messages). If a positive state is not visited, then there are no two inputs that 
can distinguish between two messages. 

The stochastic game given in (7) verifies whether a positive state is visited with probability 1. In the stochastic
game, the rewards $r(s) = 1$ and $r(s) = 0$ indicate that state $s$ is positive and non-positive, respectively. Player 1 is
the encoder, which wants to visit a positive state, and Player 2 is Nature, which chooses the output and the state such
that a positive state will not be visited. A total reward $V_n(s) = 0$ implies that in $n$ transmissions with initial state $s$,
with positive probability, the channel undergoes only non-positive states, regardless of the inputs. Thus $V_n(s) = 0$
implies $M(n, s) = 1$. ■

According to Lemma 1, $V_n(s)$ is non-negative and non-decreasing in $n$ for any $s \in \mathcal{S}$. Thus, $\min_{s \in \mathcal{S}} V_n(s)$
is also non-decreasing in $n$, and therefore $\lim_{n \to \infty} \min_{s \in \mathcal{S}} V_n(s)$ is well defined (it may also be infinite). If
$\lim_{n \to \infty} \min_{s \in \mathcal{S}} V_n(s) = 0$, then $\min_{s \in \mathcal{S}} V_n(s) = 0$, $\forall n$, and invoking Lemma 2, $\min_{s \in \mathcal{S}} M(n, s) = 1$, which
gives $C_0 = 0$ by definition. The next lemma states that to verify whether $\lim_{n \to \infty} \min_{s \in \mathcal{S}} V_n(s) > 0$, it is enough
to solve a finite-horizon problem.
Lemma 3

$$ \lim_{n \to \infty} \min_{s \in \mathcal{S}} V_n(s) = 0 \iff \min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) = 0. \tag{9} $$


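Proof: The $\Rightarrow$ direction follows from Lemma 1, which states that for any $s \in \mathcal{S}$, $V_n(s)$ is a non-negative
and non-decreasing function in $n$.

Now we prove the $\Leftarrow$ direction. Define $\mathcal{S}_n$, the set of initial states for which the reward is zero after $n$ rounds of
the stochastic game, i.e., $\mathcal{S}_n = \{s \in \mathcal{S} : V_n(s) = 0\}$. Note that $\mathcal{S}_{n+1} \subseteq \mathcal{S}_n$, $\mathcal{S}_0 = \mathcal{S}$ and $\mathcal{S}_1 = \{s \in \mathcal{S} : r(s) = 0\}$.

First, we claim that there exists $n^*$, $0 \le n^* \le |\mathcal{S}| - 1$, for which $\mathcal{S}_{n^*} = \mathcal{S}_{n^*+1}$ must hold, where $\mathcal{S}_{n^*}$ is
non-empty. Otherwise $\mathcal{S}_{n+1}$ has at least one less element than $\mathcal{S}_n$ for $0 \le n \le |\mathcal{S}| - 1$, and therefore $\mathcal{S}_{|\mathcal{S}|} = \emptyset$. If
$\mathcal{S}_{|\mathcal{S}|}$ is empty, it means that $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$, which contradicts our assumption.

The equality between $\mathcal{S}_{n^*}$ and $\mathcal{S}_{n^*+1}$ means that when the channel starts at some $s \in \mathcal{S}_{n^*+1}$, for any input letter
$x$, there exists an action of Player 2 such that the next state $s'$ would satisfy $s' \in \mathcal{S}_{n^*}$. Define this strategy of Player
2 as a function $A_2(\cdot, \cdot) : \mathcal{S}_{n^*} \times \mathcal{X} \to \mathcal{S}_{n^*}$, namely, given $s \in \mathcal{S}_{n^*}$ and any input $x \in \mathcal{X}$, the next state $s'$ depends
on $s$ and $x$ through the function $A_2(s, x)$ such that $s' = A_2(s, x)$. We claim that $\mathcal{S}_{n^*+k} = \mathcal{S}_{n^*}$, $\forall k \ge 0$, i.e., once the set
$\mathcal{S}_n$ stops shrinking, it will stay the same. To prove this, let us fix an arbitrary $s \in \mathcal{S}_{n^*+1}$. Since $\mathcal{S}_{n^*+1} \subseteq \mathcal{S}_1$, we have $s \in \mathcal{S}_1$
and $r(s) = 0$. We have

$$ \begin{aligned} V_{n^*+2}(s) &= r(s) + \max_{x \in \mathcal{X}} \min_{s' \in \mathcal{S}(s, x)} V_{n^*+1}(s') \\ &= \max_{x \in \mathcal{X}} \min_{s' \in \mathcal{S}(s, x)} V_{n^*+1}(s') \\ &\le \max_{x \in \mathcal{X}} V_{n^*+1}(A_2(s, x)) \\ &= 0. \end{aligned} $$

Therefore $\mathcal{S}_{n^*+2} = \mathcal{S}_{n^*+1}$. Repeating the same argument, we have $\mathcal{S}_{n^*+k} = \mathcal{S}_{n^*}$, $\forall k \ge 0$, which means that
$V_n(s^*) = 0$, $\forall n$, for any $s^* \in \mathcal{S}_{n^*}$. This completes the proof. ■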

The following theorem states the necessary and sufficient condition for $C_0 = 0$ through the stochastic game.
Theorem 1 The zero-error capacity is positive if and only if the total reward $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s)$ is positive, i.e.,

$$ \min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) = 0 \iff C_0 = 0. \tag{10} $$

Proof:

If $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) = 0$, then according to Lemma 3, $\lim_{n \to \infty} \min_{s \in \mathcal{S}} V_n(s) = 0$, and by Lemma 2 it
follows that $\min_{s} M(n, s) = 1$ for any $n$; hence $C_0 = 0$.

If $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$, then according to Lemma 2, $\min_{s \in \mathcal{S}} M(|\mathcal{S}|, s) \ge 2$, and it follows from the definition
of zero-error capacity that $C_0 \ge \frac{1}{|\mathcal{S}|}$. ■
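
The test of Theorem 1 only needs the support of the channel law, so it can be sketched in a few lines. The code below is a minimal illustration, not part of the original development: the representation of the channel as a set of tuples `(s, x, y, s_next)` with positive probability and the function names are assumptions made here, and the sketch assumes every state-input pair has at least one possible transition. The channel used is the two-state channel of Example 1 in Section VIII.

```python
def positive_states(support, states, inputs):
    """States in which some pair of inputs shares no (output, next state) pair."""
    positive = set()
    for s in states:
        reach = {x: {(y, t) for (s0, x0, y, t) in support if s0 == s and x0 == x}
                 for x in inputs}
        if any(not (reach[a] & reach[b])
               for a in inputs for b in inputs if a < b):
            positive.add(s)
    return positive

def total_reward(support, states, inputs, n):
    """Iterate V_n(s) = r(s) + max_x min_{s' in S(s,x)} V_{n-1}(s'), with V_0 = 0."""
    pos = positive_states(support, states, inputs)
    v = {s: 0 for s in states}
    for _ in range(n):
        nxt = {}
        for s in states:
            best = max(
                min(v[t] for (s0, x0, _, t) in support if s0 == s and x0 == x)
                for x in inputs
            )
            nxt[s] = (1 if s in pos else 0) + best
        v = nxt
    return v

# Example 1: state 0 is a noisy BSC that always moves to state 1;
# state 1 is noiseless and may move to either state.
support = {(0, x, y, 1) for x in (0, 1) for y in (0, 1)}
support |= {(1, x, x, t) for x in (0, 1) for t in (0, 1)}
states, inputs = (0, 1), (0, 1)

v = total_reward(support, states, inputs, len(states))   # V_{|S|}
capacity_is_positive = min(v.values()) > 0
```

Only `len(states)` iterations are run, which is what Lemma 3 suggests is sufficient for deciding whether the limit is zero.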






IV. The Dynamic Programming Problem associated with the Channel 

In this section, we define a dynamic programming problem associated with the channel. The solution to the 
problem is later used to determine the feedback capacity of the channel. 

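Denote $G(y, s'|s) = \{x : x \in \mathcal{X}, p(y, s'|x, s) > 0\}$, i.e., $G(y, s'|s)$ is the set of input letters at state $s$ that can
drive the channel state to $s'$ while yielding an output letter $y$ with positive probability. Denote $W(\cdot, \cdot)$ as a mapping
$\mathbb{Z}_+ \times \mathcal{S} \to \mathbb{R}_+$. Set $W(0, s) = 1$, $\forall s \in \mathcal{S}$, as the initial value. Denote $P_{X|S}(\cdot|\cdot)$ as a mapping $\mathcal{X} \times \mathcal{S} \to \mathbb{R}_+$
such that for each $s \in \mathcal{S}$, $P_{X|S}(\cdot|s)$ is a probability mass function (pmf) on $\mathcal{X}$, i.e., $\sum_{x \in \mathcal{X}} P_{X|S}(x|s) = 1$ and
$P_{X|S}(x|s) \ge 0$, $\forall x \in \mathcal{X}$. The term $W(\cdot, \cdot)$ is the solution to the problem defined iteratively by:

$$ W(n, s) = \max_{P_{X|S}(\cdot|s)} \min_{s' \in \mathcal{S}} \left\{ W(n-1, s') \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y, s'|s)} P_{X|S}(x|s) \right]^{-1} \right\}, \tag{11} $$

$\forall s \in \mathcal{S}$, and for $n = 1, 2, 3, \dots$

We adopt the convention that $\frac{1}{0} = \infty$, and, if $G(y, s'|s) = \emptyset$, $\sum_{x \in G(y, s'|s)} P_{X|S}(x|s) = 0$. One property that can
be verified from the definition and the initial value is that $\forall n \ge 0$, $\forall s \in \mathcal{S}$, $W(n, s) \ge 1$.
The main result of this paper is the following theorem: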

Theorem 2 If $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$,

$$ C_0 = \liminf_{n \to \infty} \frac{1}{n} \min_{s \in \mathcal{S}} \log_2 W(n, s); \tag{12} $$

otherwise $C_0 = 0$.
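
To make the iteration (11) concrete, the following sketch computes $W(n, s)$ exactly for channels with a binary input alphabet. It is an illustration under assumptions made here, not the method of the paper: the channel is again described by a hypothetical set `support` of tuples `(s, x, y, s_next)` with positive probability, and the inner maximization over $P_{X|S}(\cdot|s)$ is reduced to a one-dimensional problem in $p = P_{X|S}(0|s)$. For each reachable pair $(y, s')$ the quantity $\sum_{x \in G(y, s'|s)} P_{X|S}(x|s) / W(n-1, s')$ is linear in $p$, so the maximum of these lines is convex and its minimum over $[0, 1]$ is attained at an endpoint or at a crossing of two lines.

```python
from fractions import Fraction

def w_step(support, states, w_prev):
    """One iteration of (11) for input alphabet {0, 1}, solved exactly."""
    w_next = {}
    for s in states:
        # Each reachable (y, s') gives a line a*p + b in p = P(x=0|s):
        # the mass of G(y, s'|s) divided by W(n-1, s').
        pairs = {(y, t) for (s0, _, y, t) in support if s0 == s}
        lines = []
        for (y, t) in pairs:
            has0 = (s, 0, y, t) in support
            has1 = (s, 1, y, t) in support
            a = Fraction(int(has0) - int(has1)) / w_prev[t]
            b = Fraction(int(has1)) / w_prev[t]
            lines.append((a, b))
        # Candidate minimizers: the endpoints and all pairwise crossings in [0, 1].
        candidates = {Fraction(0), Fraction(1)}
        for (a1, b1) in lines:
            for (a2, b2) in lines:
                if a1 != a2:
                    p = (b2 - b1) / (a1 - a2)
                    if 0 <= p <= 1:
                        candidates.add(p)
        worst = min(max(a * p + b for (a, b) in lines) for p in candidates)
        w_next[s] = 1 / worst
    return w_next

# Support of the two-state channel studied in Example 2 (Section VIII).
support = {(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 1, 1), (1, 0, 0, 0), (1, 1, 1, 1)}
states = (0, 1)

w = {s: Fraction(1) for s in states}      # W(0, s) = 1
table = []
for n in range(1, 6):
    w = w_step(support, states, w)
    table.append((n, w[0], w[1]))
```

Exact rational arithmetic is used so that the values can be compared directly with those listed for this channel in Table II. For larger input alphabets the inner problem becomes a linear program, which this sketch does not attempt to solve.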

Before proving the theorem, let us verify that the zero-error capacity of a DMC [1, Theorem 7] is a special case of
Theorem 2. Since a DMC is an FSC with only one state, $V_{|\mathcal{S}|}(s) = 0$ means that the state is non-positive, i.e., "all
pairs of input letters are adjacent", as stated in [1, Theorem 7]. If $V_{|\mathcal{S}|}(s) > 0$, for a DMC, define $M(n) = M(n, s)$
and $G(y) = G(y, s'|s)$. Then



$$ M(n, s) = \max_{P_{X|S}(\cdot|s)} \left\{ M(n-1) \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y)} P_{X|S}(x|s) \right]^{-1} \right\} = M(n-1) \max_{P_X} \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y)} P_X(x) \right]^{-1} \tag{13} $$

and

$$ C_0 = \liminf_{n \to \infty} \frac{1}{n} \log_2 M(n) = \max_{P_X} \log_2 \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y)} P_X(x) \right]^{-1}, \tag{14} $$

which is exactly the result for the DMC in [1].
The converse and the direct parts of Theorem 2 are proved in Section V and Section VI, respectively.

V. Converse 

Theorem 3 (Converse.) $M(n, s) \le W(n, s)$, $\forall n = 0, 1, 2, \dots$ and $\forall s \in \mathcal{S}$.

Proof: We prove the theorem by induction. First, the inequality holds when $n = 0$.
Now, suppose $M(k, s) \le W(k, s)$ is true $\forall k = 0, \dots, n-1$ and $\forall s \in \mathcal{S}$. Fix an arbitrary initial state $s_0$. It is
sufficient to show that $M(n, s_0) \le W(n, s_0)$ to prove the converse.
For a fixed zero-error code that has $M(n, s_0)$ messages, we define



$$ \begin{aligned} u(x|s_0) &= \text{number of messages with first transmitted letter } x \text{ when the initial state is } s_0, \\ f(x|s_0) &= \frac{u(x|s_0)}{M(n, s_0)}. \end{aligned} \tag{15} $$

Note that $f(\cdot|s_0)$ is a valid pmf.

After the first transmission, suppose the output is some $y \in \mathcal{Y}$ and the channel goes to state $s_1$. We have
$\sum_{x \in G(y, s_1|s_0)} u(x|s_0)$ messages, each of which with positive probability gives output $y$ and changes the state to
$s_1$. To guarantee that the decoder can distinguish between these messages in the following $n - 1$ transmissions, we
must have $\sum_{x \in G(y, s_1|s_0)} u(x|s_0) \le M(n-1, s_1)$, which yields

$$ M(n, s_0) \sum_{x \in G(y, s_1|s_0)} f(x|s_0) \le M(n-1, s_1). \tag{16} $$

Since the above inequality must hold $\forall y \in \mathcal{Y}$ and $\forall s_1 \in \mathcal{S}$,

$$ M(n, s_0) \le \min_{s_1 \in \mathcal{S}} \left\{ M(n-1, s_1) \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y, s_1|s_0)} f(x|s_0) \right]^{-1} \right\}. \tag{17} $$




Since we assumed $M(n-1, s) \le W(n-1, s)$ for all $s \in \mathcal{S}$,

$$ M(n, s_0) \le \min_{s_1 \in \mathcal{S}} \left\{ W(n-1, s_1) \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y, s_1|s_0)} f(x|s_0) \right]^{-1} \right\}. \tag{18} $$

Using the iterative formula of $W(n, s_0)$ given in (11) and the fact that $f(\cdot|s_0)$ is a valid pmf, we obtain

$$ M(n, s_0) \le W(n, s_0). \tag{19} $$

Finally, since $s_0$ is arbitrarily fixed, we have $M(n, s) \le W(n, s)$, $\forall s \in \mathcal{S}$. By induction, the theorem is proved. ■
From the converse (Theorem 3) and the zero-error capacity definition (Definition 3), we have the following upper bound:

$$ C_0 = \lim_{n \to \infty} \frac{\min_{s \in \mathcal{S}} \log_2 M(n, s)}{n} \le \liminf_{n \to \infty} \frac{\min_{s \in \mathcal{S}} \log_2 W(n, s)}{n}. \tag{20} $$
VI. Direct Theorem

Theorem 4 Assume $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$. Then for any initial state $s \in \mathcal{S}$ there exists an $n_0 > 0$ such that for
$n > n_0$, $\lfloor W(n, s) \rfloor$ messages can be transmitted with no more than $n + |\mathcal{S}| \lceil \log_2 L \rceil$ transmissions, where $L$ is a positive integer
that does not depend on $n$ and $s$.

Proof: The direct part is proved using deterministic codes [1] rather than random codes. Let the solution and
the maximizer in the $k$th iteration ($k = 1, 2, \dots, n$) of (11) be $W(k, \cdot)$ and $P^{(k)}_{X|S}(\cdot|\cdot)$, respectively.

Suppose that at the first transmission the channel state is $s_1$ and the total number of messages transmitted through
the channel is $\lfloor W(n, s_1) \rfloor$. We divide the message set into $|\mathcal{X}|$ groups and transmit $x = i$ for the messages in the
$i$th group for the first transmission. Let $m_i$ denote the number of messages in the $i$th group. By similar arguments
to those in [1, p. 18], we can control the size of each group such that:

$$ \begin{aligned} &\text{if } P^{(n)}_{X|S}(i|s_1) > 0, \quad \left| \frac{m_i}{\lfloor W(n, s_1) \rfloor} - P^{(n)}_{X|S}(i|s_1) \right| \le \frac{1}{\lfloor W(n, s_1) \rfloor}; \\ &\text{if } P^{(n)}_{X|S}(i|s_1) = 0, \quad m_i = 0. \end{aligned} \tag{21} $$

Both the transmitter and the receiver know how the messages are divided before the transmission. An arbitrary
message $m \in \{1, \dots, \lfloor W(n, s_1) \rfloor\}$ is selected, and letter $i$ is sent if $m$ belongs to the $i$th group. The number of
messages about which the receiver is uncertain before the first transmission is $Z_1 = \lfloor W(n, s_1) \rfloor$.
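
One way to obtain group sizes that satisfy (21) is largest-remainder rounding, sketched below. This particular rounding rule is an assumption made for illustration and is not necessarily the construction in [1]; the function name `split_messages` is hypothetical. Each $m_i$ is within one message of $\lfloor W(n, s_1) \rfloor P^{(n)}_{X|S}(i|s_1)$, letters with zero probability receive no messages, and the sizes sum to the total number of messages.

```python
from fractions import Fraction

def split_messages(num_messages, pmf):
    """Group sizes m_i with |m_i / M - P_i| <= 1/M and m_i = 0 when P_i = 0."""
    exact = [num_messages * Fraction(p) for p in pmf]
    sizes = [int(e) for e in exact]  # floor, since every entry is non-negative
    leftover = num_messages - sum(sizes)
    # Hand the remaining messages to the groups with the largest fractional part.
    by_fraction = sorted(range(len(pmf)),
                         key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in by_fraction[:leftover]:
        sizes[i] += 1
    return sizes

# Illustrative numbers only: 8 messages and a pmf with one unused letter.
pmf = [Fraction(1, 3), Fraction(2, 3), Fraction(0)]
sizes = split_messages(8, pmf)
```

A letter with zero probability has zero fractional part and is never among the groups that receive a leftover message, because the leftover count is strictly smaller than the number of groups with a positive fractional part.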

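After the first transmission, we obtain an output $y_1$, and the channel state changes to $s_2$. Denote $Z_2$ as the
number of messages that are compatible with $(y_1, s_2)$, i.e., when transmitting those messages, $(y_1, s_2)$ is obtained
with positive probability. $Z_2$ can be upper bounded in the following way:

$$ \begin{aligned} Z_2 &= \sum_{x \in G(y_1, s_2|s_1)} m_x \\ &= \lfloor W(n, s_1) \rfloor \sum_{x \in G(y_1, s_2|s_1)} \frac{m_x}{\lfloor W(n, s_1) \rfloor} \\ &\le \lfloor W(n, s_1) \rfloor \sum_{x \in G(y_1, s_2|s_1)} \left( P^{(n)}_{X|S}(x|s_1) + \frac{1}{\lfloor W(n, s_1) \rfloor} \right) \\ &\le \lfloor W(n, s_1) \rfloor \max_{y \in \mathcal{Y}} \sum_{x \in G(y, s_2|s_1)} P^{(n)}_{X|S}(x|s_1) + |\mathcal{X}|. \end{aligned} \tag{22} $$
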

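For convenience, let us define

$$ J^{(k)}(s, s') = \max_{y \in \mathcal{Y}} \sum_{x \in G(y, s'|s)} P^{(k)}_{X|S}(x|s). \tag{23} $$

Eq. (11) and (22) can be written, respectively, in terms of $J^{(k)}(s, s')$ as:

$$ W(k, s) \le W(k-1, s') \left[ J^{(k)}(s, s') \right]^{-1}, \quad \forall k \in \mathbb{Z}_+,\ s \in \mathcal{S},\ s' \in \mathcal{S}, \tag{24} $$

$$ \begin{aligned} Z_2 &\le \lfloor W(n, s_1) \rfloor J^{(n)}(s_1, s_2) + |\mathcal{X}| \\ &\le W(n-1, s_2) + |\mathcal{X}|, \end{aligned} \tag{25} $$

where the last inequality is due to (24).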
Since both transmitter and receiver know $s_1$ and $s_2$ and the transmitter knows the output $y_1$ through feedback,
both of them know which messages are compatible with $(y_1, s_2)$. In the second transmission, the transmitter can
further divide the remaining $Z_2$ messages into groups according to $P^{(n-1)}_{X|S}(\cdot|s_2)$, similar to eq. (21). The way the
messages are divided is known to the receiver. Suppose the output letter is $y_2$ and the state goes to $s_3$. Following
the argument in the previous iteration, we have

$$ \begin{aligned} Z_3 &\le Z_2 J^{(n-1)}(s_2, s_3) + |\mathcal{X}| \\ &\overset{(a)}{\le} W(n-1, s_2) J^{(n-1)}(s_2, s_3) + |\mathcal{X}| \left( 1 + J^{(n-1)}(s_2, s_3) \right) \\ &\overset{(b)}{\le} W(n-2, s_3) + |\mathcal{X}| \left( 1 + J^{(n-1)}(s_2, s_3) \right), \end{aligned} \tag{26} $$

where steps (a) and (b) follow from (25) and (24), respectively.



As the transmission proceeds, the channel state evolves as $s_1, \dots, s_n, s_{n+1}$, and the output sequence is $y_1, \dots, y_n$.
The transmitter divides the remaining uncertain messages according to $P^{(n+1-k)}_{X|S}(\cdot|s_k)$ for the $k$th transmission. After the
$n$th transmission, the number of messages remaining can be upper bounded as:

$$ \begin{aligned} Z_{n+1} &\le Z_n J^{(1)}(s_n, s_{n+1}) + |\mathcal{X}| \\ &\le 1 + |\mathcal{X}| \left( 1 + J^{(1)}(s_n, s_{n+1}) + J^{(1)}(s_n, s_{n+1}) J^{(2)}(s_{n-1}, s_n) + \cdots + \prod_{i=1}^{n-1} J^{(i)}(s_{n+1-i}, s_{n+2-i}) \right). \end{aligned} \tag{27} $$

Using Ineq. (24) iteratively, we obtain

$$ W(k, s_{n+1-k}) \le \left[ \prod_{i=1}^{k} J^{(i)}(s_{n+1-i}, s_{n+2-i}) \right]^{-1}; \tag{28} $$

hence we can further upper bound $Z_{n+1}$ as

$$ Z_{n+1} \le 1 + |\mathcal{X}| \left( 1 + \frac{1}{W(1, s_n)} + \frac{1}{W(2, s_{n-1})} + \cdots + \frac{1}{W(n-1, s_2)} \right). \tag{29} $$


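Recall the assumption of the theorem, $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$, which implies, via Theorem 1, that $C_0 > 0$; following
from Theorem 3, we obtain that

$$ \liminf_{n \to \infty} \min_{s \in \mathcal{S}} \frac{1}{n} \log M(n, s) > 0. \tag{30} $$

Hence, there exist $\epsilon > 0$ and an integer $n_0$ such that $\forall s \in \mathcal{S}$, $\forall n > n_0$, $W(n, s) \ge M(n, s) \ge 2^{\epsilon n}$ (the first inequality
is due to the converse proved in the previous section, and the second inequality is due to (30)). Recall that $M(n, s) \ge 1$;
we can thus further upper bound $Z_{n+1}$ as

$$ \begin{aligned} Z_{n+1} &\le 1 + |\mathcal{X}| \left( 1 + \sum_{k=1}^{n_0} \frac{1}{W(k, s_{n+1-k})} + \sum_{k=n_0+1}^{\infty} 2^{-\epsilon k} \right) \\ &\le 1 + |\mathcal{X}| \left( n_0 + 1 + \frac{2^{-\epsilon n_0}}{2^{\epsilon} - 1} \right) \\ &\triangleq L. \end{aligned} \tag{31} $$
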

Note that $L$ is finite and is independent of $n$ and $s_1$. This means that after $n$ transmissions, the number of messages
about which the receiver is uncertain is not more than $L$.

The assumption that $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$ implies that we can drive the channel to a positive state with probability
1 in less than $|\mathcal{S}|$ transmissions. In a positive state, we can transmit 1 bit of information with zero error; hence
we can now conclude that there exists a zero-error code such that $\lfloor W(n, s) \rfloor$ messages can be transmitted with no
more than $n + |\mathcal{S}| \lceil \log_2 L \rceil$ transmissions. ■

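Based on the direct theorem, it is straightforward to derive a lower bound on the zero-error capacity:

$$ \begin{aligned} C_0 &\ge \liminf_{n \to \infty} \min_{s \in \mathcal{S}} \frac{\log_2 \lfloor W(n, s) \rfloor}{n + |\mathcal{S}| \lceil \log_2 L \rceil} \\ &= \liminf_{n \to \infty} \frac{1}{n} \min_{s \in \mathcal{S}} \log_2 W(n, s), \end{aligned} \tag{32} $$

given the condition $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$. Combining ineq. (20) and ineq. (32), we have proved eq. (12) and thus
Theorem 2.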



VII. Solving the Dynamic Programming Problem 

Throughout this section, we assume that $\min_{s \in \mathcal{S}} V_{|\mathcal{S}|}(s) > 0$, i.e., we focus on channels with positive zero-error
capacity. Let us first introduce a few definitions so that we can use the standard language of dynamic programming
to rewrite Eq. (11) in the form of Eq. (6). Basically, we take $\log_2$ on both sides of Eq. (11). Define the value
function as $J_n(s) = \log_2 W(n, s)$, the action as $a = P_{X|S}(\cdot|s)$, and the reward as

$$ r(s', a, s) = \log_2 \left[ \max_{y \in \mathcal{Y}} \sum_{x \in G(y, s'|s)} P_{X|S}(x|s) \right]^{-1}. \tag{33} $$

The DP equation in (11) then becomes simply

$$ J_n(s) = \max_{a \in \mathcal{A}} \min_{s' \in \mathcal{S}} \left[ r(s', a, s) + J_{n-1}(s') \right], \tag{34} $$

where $\mathcal{A}$ is the action space, $\mathcal{A} = \{f(x) : \sum_x f(x) = 1, f(x) \ge 0\}$.
Theorem 2 states that

$$ C_0 = \liminf_{n \to \infty} \min_{s \in \mathcal{S}} \frac{J_n(s)}{n}. \tag{35} $$

Define an operator $T$ as follows,

$$ (T \circ J)(s) = \max_{a \in \mathcal{A}(s)} \min_{s'} \left\{ r(s', a, s) + J(s') \right\}. \tag{36} $$

The DP equation can be rewritten in a compact form as follows,

$$ J_n(s) = (T \circ J_{n-1})(s), \tag{37} $$

with initial value $J_0(s) = 0$. We also denote by $T^n$ the application of operator $T$ $n$ times.
Lemma 4 Let $W$ and $V$ denote two functions $\mathcal{S} \to \mathbb{R}_+$. The following properties of $T$ hold:

(a) If $W(s) \ge V(s)$, $\forall s \in \mathcal{S}$, then $T \circ W(s) \ge T \circ V(s)$, $\forall s \in \mathcal{S}$.

(b) If $W(s) = V(s) + d$, $\forall s \in \mathcal{S}$, where $d$ is a constant, then $T \circ W(s) = T \circ V(s) + d$, $\forall s \in \mathcal{S}$.

Proof: Both parts of the lemma follow directly from the definition of $T$. ■
Lemma 5 The following properties of $J_n$ hold:

(a) The sequence $\{\min_s J_n(s)\}$ is super-additive, i.e., $\min_s J_{n+m}(s) \ge \min_s J_n(s) + \min_s J_m(s)$.

(b) The sequence $\{\max_s J_n(s)\}$ is sub-additive, i.e., $\max_s J_{n+m}(s) \le \max_s J_n(s) + \max_s J_m(s)$.

Proof: We prove the first property here. The proof of the second one is similar.

$$ \begin{aligned} \min_s J_{n+m}(s) &= \min_s (T^n \circ J_m)(s) \\ &\overset{(a)}{\ge} \min_s \left( T^n \circ \min_{s'} J_m(s') \right)(s) \\ &= \min_s \left( T^n \circ \left[ J_0 + \min_{s'} J_m(s') \right] \right)(s) \\ &\overset{(b)}{=} \min_s (T^n \circ J_0)(s) + \min_{s'} J_m(s'), \end{aligned} \tag{38} $$

where the steps (a) and (b) follow from parts (a) and (b) of Lemma 4, respectively. ■
Theorem 5 The liminf in Theorem 2 can be replaced by lim, i.e.,

$$ C_0 = \lim_{n \to \infty} \min_s \frac{J_n(s)}{n}, \tag{39} $$

and for all $n \in \mathbb{Z}_+$ the following bounds hold:

$$ \min_s \frac{J_n(s)}{n} \le C_0 \le \max_s \frac{J_n(s)}{n}. \tag{40} $$
Proof: Following Lemma 5 and Fekete's lemma [8, Ch. 2.6], we obtain the following two limits:

$$ \begin{aligned} \lim_{n \to \infty} \min_s \frac{J_n(s)}{n} &= \sup_n \min_s \frac{J_n(s)}{n}, \\ \lim_{n \to \infty} \max_s \frac{J_n(s)}{n} &= \inf_n \max_s \frac{J_n(s)}{n}. \end{aligned} \tag{41} $$

Finally, from Theorem 2 we obtain:

$$ \max_s \frac{J_k(s)}{k} \ge \lim_{n \to \infty} \max_s \frac{J_n(s)}{n} \ge C_0 = \lim_{n \to \infty} \min_s \frac{J_n(s)}{n} \ge \min_s \frac{J_k(s)}{k}, \tag{42} $$

for all $k \in \mathbb{Z}_+$. ■

Eq. (40) provides a numerical way to approximate $C_0$. We now turn to the case in which an analytical solution in
the limit can be obtained via Bellman equations.

Theorem 6 (Bellman equation) If there exist a positive bounded function $g : \mathcal{S} \to \mathbb{R}_+$ and a constant $\rho$ that
satisfy

$$ g(s) + \rho = (T \circ g)(s) \tag{43} $$

then $\lim_{n \to \infty} \frac{1}{n} J_n(s) = \rho$.

Proof: Assume that there exist a positive bounded function $g : \mathcal{S} \to \mathbb{R}_+$ and a constant $\rho$ that satisfy
$g(s) + \rho = (T \circ g)(s)$. Define $g_0(s) = g(s)$ and $g_n(s) = T^n g_0(s)$. Since $J_0(s) = 0 \le g_0(s)$, then according to part
(a) of Lemma 4, $J_n(s) \le g_n(s)$. Let $d = \max_s g(s)$. Then $J_0 + d \ge g_0$. Hence, according to parts (a) and (b) of Lemma 4,
$g_n(s) \le J_n(s) + d$. Therefore we have

$$ g_n(s) - d \le J_n(s) \le g_n(s). \tag{44} $$

Finally, $g(s) + \rho = (T \circ g)(s)$ implies that $\lim_{n \to \infty} \frac{g_n(s)}{n} = \rho$; hence $\lim_{n \to \infty} \frac{J_n(s)}{n} = \rho$. ■
Remark: $\rho$ does not depend on the initial state, which hints that for some decomposable Markov chains, it is
impossible to find a $g : \mathcal{S} \to \mathbb{R}_+$ and a constant $\rho$ to satisfy the Bellman equation.

VIII. Examples 

Here we provide three examples and solve them analytically. For the first two examples, we also find the regular 
feedback capacity using [3]. 

Example 1 We consider the very simple example illustrated in Fig. 2. The channel has two states. In state 0, the
channel is a binary symmetric channel (BSC) with positive cross probability. In state 1, the channel is a BSC with
zero cross probability. Roughly speaking, in state 0, the channel is noisy, and, in state 1, the channel is noiseless.
Suppose the channel state evolves as a Markov process and is independent of the input and output. If the current
state is 0, the next channel state is 1 with certainty. If the state is 1, the channel goes to state 0 with probability
$p > 0$ or stays at state 1 with probability $1 - p$. Thus, the channel stays in the noiseless state for a geometrically distributed length of
time, and after visiting the noisy state it returns to the noiseless state immediately.
Fig. 2. Channel topology of Example 1



Finding $C_0$ by calculating $W(n, s)$: for this channel $G(y, 0|0) = \emptyset$, $G(y, 1|0) = \{0, 1\}$, $G(y, 0|1) = G(y, 1|1) =
\{y\}$. Using eq. (11), we have the solution to the DP problem of the 1st iteration as

$$ \begin{aligned} W(1, 0) &= \max_{P_{X|S}(\cdot|0)} \min\{1, 1\} = 1, \\ W(1, 1) &= \max_{P_{X|S}(\cdot|1)} \frac{1}{\max\{P_{X|S}(0|1), P_{X|S}(1|1)\}} = 2. \end{aligned} \tag{45} $$

For the 2nd iteration, we have

$$ \begin{aligned} W(2, 0) &= \max_{P_{X|S}(\cdot|0)} W(1, 1) \min\{1, 1\} = 2, \\ W(2, 1) &= \max_{P_{X|S}(\cdot|1)} \frac{W(1, 0)}{\max\{P_{X|S}(0|1), P_{X|S}(1|1)\}} = 2. \end{aligned} \tag{46} $$

By induction and some simple algebra, we obtain the solution to the DP problem at the $n$th iteration:

$$ W(n, 0) = 2^{\lfloor n/2 \rfloor}, \quad \text{and} \quad W(n, 1) = 2^{\lceil n/2 \rceil}. \tag{47} $$

Thus

$$ C_0 = 1/2. \tag{48} $$

Alternatively, we can solve the example by finding a solution to the Bellman equation (43).

Finding $C_0$ via Bellman equation: the Bellman equation for the channel is simply the following,

$$ \begin{aligned} g(0) &= g(1) - \rho, \\ g(1) &= 1 + g(0) - \rho. \end{aligned} \tag{49} $$

Using simple algebra we obtain $\rho = \frac{1}{2}$, $g(0) = v$, $g(1) = v + \frac{1}{2}$. We note that we can achieve the zero-error capacity
with feedback and state information simply by transmitting 1 bit of information whenever the channel state is 1.
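
The solution of (49) can be checked numerically against the fixed-point equation (43). The sketch below is only a sanity check under assumptions made here: the function name is hypothetical, the maximization over the action $q = P_{X|S}(0|1)$ is approximated on a grid that contains $q = 1/2$, and the constant $v$ is set to 1 so that $g$ is positive.

```python
from math import log2

def bellman_residuals(g0, g1, rho, steps=1000):
    """Residuals of g(s) + rho = (T g)(s) for the two-state channel of Example 1."""
    # State 0: every input reaches state 1 through a shared output, so the
    # reward is log2(1) = 0 and (T g)(0) = g(1).
    t_g0 = g1
    # State 1: noiseless, and Nature picks the next state, so
    # (T g)(1) = max_q -log2(max(q, 1 - q)) + min(g(0), g(1)), with q = P(x=0|1).
    best_reward = max(-log2(max(q, 1 - q))
                      for q in (i / steps for i in range(steps + 1)))
    t_g1 = best_reward + min(g0, g1)
    return (g0 + rho - t_g0, g1 + rho - t_g1)

# v = 1 is an arbitrary positive choice; rho = 1/2 as obtained from (49).
residuals = bellman_residuals(g0=1.0, g1=1.5, rho=0.5)
```

Both residuals vanish for any choice of $v$, which reflects the fact that $g$ is determined only up to an additive constant.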

Finding the regular feedback capacity $C$: To calculate the regular capacity we use the result of Chen and Berger
in [3, Theorem 6]. The theorem states that if the channel is strongly irreducible and strongly aperiodic, then the
capacity is

$$ C = \max_{P_{X|S}} \sum_{k \in \mathcal{S}} \pi_k I(X; Y | S = k), \tag{50} $$

where $\pi_k$ is the equilibrium distribution of state $k$ induced by the input distribution $P_{X|S}$.
The channel is strongly irreducible and strongly aperiodic if the matrix $T$ that is defined as

$$ T(k, l) = \min_x \{\Pr(S_i = l \mid X_i = x, S_{i-1} = k)\} \tag{51} $$

is irreducible and aperiodic for any $x \in \mathcal{X}$. Since the transition probability of the state does not depend on the
input, and since the state transition matrix is irreducible and aperiodic for any $p < 1$, the capacity is given by (50);
hence

$$ \begin{aligned} C(p) &= \max \left[ \pi_0 I(X; Y | S = 0) + \pi_1 I(X; Y | S = 1) \right] \\ &= \pi_1 \end{aligned} $$

Fig. 3. Feedback capacity and zero-error feedback capacity of the channel in Example 1 for different values of p = Pr{S = 1|S = 1}.

Example 2 Let us consider another channel with two states as illustrated in Fig. 4. In state 0, the channel is a
Z-channel. In state 1, the channel is a BSC with zero cross probability. The next channel state is determined by the
output. If the output is 0, the channel goes to state 0; if the output is 1, the channel goes to state 1; hence the
regular feedback of the output includes the state information.

It is tempting to make full use of state 1, i.e., to transmit 1 bit of information, but as a consequence the channel
goes to the undesirable state half the time, and the rate would be only $\frac{1}{2}$.

Finding $C_0$ by calculating $W(n,s)$: For this channel, $G(0,0|0) = \{0\}$, $G(1,1|0) = \{0,1\}$, $G(0,0|1) = \{0\}$, $G(1,1|1) = \{1\}$ and all the other combinations yield empty sets. For initial state 0, we have

\[ W(n,0) = \max_{P_{X|S}(\cdot|0)} \min\left\{ \frac{W(n-1,0)}{P_{X|S}(0|0)},\; W(n-1,1) \right\} = W(n-1,1). \]

The maximum is achieved by setting $P_{X|S}(0|0) = 0$. For initial state 1, we have

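\begin{align}
W(n,1) &= \max_{P_{X|S}(\cdot|1)} \min\left\{ \frac{W(n-1,0)}{P_{X|S}(0|1)},\; \frac{W(n-1,1)}{P_{X|S}(1|1)} \right\} \tag{53} \\
&= \max_{P_{X|S}(\cdot|1)} \min\left\{ \frac{W(n-2,1)}{P_{X|S}(0|1)},\; \frac{W(n-1,1)}{P_{X|S}(1|1)} \right\} \nonumber \\
&= W(n-2,1) + W(n-1,1). \tag{54}
\end{align}

By setting $P_{X|S}(0|1) = \frac{W(n-2,1)}{W(n-2,1) + W(n-1,1)}$, the maximum is achieved. Recall $W(0,1) = 1$. Notice that $W(1,1) = 2$, which can be computed directly. Thus, both $W(n,1)$ and $W(n,0)$ are Fibonacci sequences (with proper shifts). Therefore, $\lim_{n\to\infty} \frac{1}{n}\log_2 W(n,1) = \lim_{n\to\infty} \frac{1}{n}\log_2 W(n,0) = \log_2 \frac{1+\sqrt{5}}{2}$. By the theorem relating $C_0$ to $W(n,s)$, we have

\[ C_0 = \log_2 \frac{1+\sqrt{5}}{2} \approx 0.6942, \tag{55} \]

which is the log of the golden ratio. Here, we list the first few values of $W(n,s)$ in Table II.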



TABLE II

$W(n,s)$, WHICH EQUALS THE NUMBER OF MESSAGES THAT CAN BE TRANSMITTED ERROR-FREE THROUGH THE CHANNEL IN EXAMPLE 2 IN $n$ STEPS STARTING AT STATE $s$

  s \ n     1     2     3     4     5
    0       1     2     3     5     8
    1       2     3     5     8    13
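The table entries and the growth rate can be reproduced with a few lines of code. This is a minimal sketch, assuming the recursions $W(n,0) = W(n-1,1)$ and $W(n,1) = W(n-2,1) + W(n-1,1)$ with $W(0,s) = 1$ and $W(1,1) = 2$; the helper `maxmin_split` is a hypothetical brute-force check that $\max_q \min\{a/q,\, b/(1-q)\} = a + b$, which is the step that turns the max-min into a sum.

```python
import math

def example2_message_counts(n_max):
    """Return (W0, W1) with W0[n] = W(n, 0) and W1[n] = W(n, 1)."""
    W0 = [1] * (n_max + 1)
    W1 = [1] * (n_max + 1)
    if n_max >= 1:
        W0[1], W1[1] = 1, 2  # W(1,0) = W(0,1) = 1 and W(1,1) = 2
    for n in range(2, n_max + 1):
        W0[n] = W1[n - 1]              # state 0: move to state 1 without sending information
        W1[n] = W1[n - 2] + W1[n - 1]  # state 1: Fibonacci-type recursion
    return W0, W1

def maxmin_split(a, b, grid=10000):
    """Grid search for max over q in (0, 1) of min(a / q, b / (1 - q))."""
    best = 0.0
    for k in range(1, grid):
        q = k / grid
        best = max(best, min(a / q, b / (1.0 - q)))
    return best

W0, W1 = example2_message_counts(60)
golden_rate = math.log2((1 + math.sqrt(5)) / 2)
empirical_rate = math.log2(W1[60]) / 60
```

The finite-$n$ rate `empirical_rate` approaches `golden_rate` only slowly (the gap shrinks like $1/n$), which is why a fixed-point argument is convenient for obtaining the exact limit.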



Finding $C_0$ via a Bellman equation: Since the channel input is binary, the actions are equivalent to two numbers: $p_0 = P_{X|S}(0|0)$, $p_1 = P_{X|S}(0|1)$. Bellman's equation becomes

\begin{align}
J(0) + \rho &= \max_{0 \le p_0 \le 1} \min\left\{ \log_2\frac{1}{p_0} + J(0),\; J(1) \right\}, \nonumber \\
J(1) + \rho &= \max_{0 \le p_1 \le 1} \min\left\{ \log_2\frac{1}{p_1} + J(0),\; \log_2\frac{1}{1-p_1} + J(1) \right\}, \tag{56}
\end{align}

which implies that $p_0 = 0$ and

\begin{align}
J(0) &= J(1) - \rho, \nonumber \\
J(1) &= J(0) + \log_2\frac{1}{p_1} - \rho, \tag{57} \\
\log_2\frac{1}{p_1} + J(0) &= \log_2\frac{1}{1-p_1} + J(1), \nonumber
\end{align}

the solution of which is $\rho = \log_2\frac{1+\sqrt{5}}{2}$, $p_1 = \frac{3-\sqrt{5}}{2}$.
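The fixed point can be verified numerically. The sketch below is a hedged check, not the authors' procedure: it assumes the two max-min equations of this example with logarithms to base 2, normalizes $J(0) = 0$ (the relative values are defined only up to a constant), and evaluates each right-hand side by grid search; `rhs_state0` and `rhs_state1` are hypothetical names.

```python
import math

PHI = (1 + math.sqrt(5)) / 2
rho = math.log2(PHI)              # candidate average reward
p1_star = (3 - math.sqrt(5)) / 2  # candidate action in state 1, equal to 1 / PHI**2
J = {0: 0.0, 1: rho}              # J(1) - J(0) = rho; J(0) = 0 is a normalization

def rhs_state0(J, grid=2000):
    """max over p0 in [0, 1] of min{log2(1/p0) + J(0), J(1)}."""
    best = -math.inf
    for k in range(grid + 1):
        p0 = k / grid
        first = math.inf if p0 == 0 else math.log2(1 / p0) + J[0]
        best = max(best, min(first, J[1]))
    return best

def rhs_state1(J, grid=2000):
    """max over p1 in (0, 1) of min{log2(1/p1) + J(0), log2(1/(1-p1)) + J(1)}."""
    best = -math.inf
    for k in range(1, grid):
        p1 = k / grid
        best = max(best, min(math.log2(1 / p1) + J[0],
                             math.log2(1 / (1 - p1)) + J[1]))
    return best
```

With these candidates both equations hold up to the grid resolution, and $\log_2(1/p_1) = 2\rho$ shows that two channel uses starting in state 1 are worth $2\rho$ bits.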

It is of interest to observe that starting at state 1, any binary sequence with length $n$ and no consecutive 0's can be transmitted with zero error in $n$ transmissions. The number of such sequences as a function of $n$ is also a Fibonacci sequence. Since we can always transmit a 1 to drive the channel from state 0 to state 1, this is actually one way to achieve the zero-error capacity.
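The counting claim is easy to confirm by brute force. The following minimal sketch enumerates all binary sequences of length $n$ and keeps those with no two consecutive 0's; the function name `count_no_consecutive_zeros` is hypothetical.

```python
from itertools import product

def count_no_consecutive_zeros(n):
    """Number of binary sequences of length n without two adjacent 0's."""
    return sum(
        1
        for bits in product((0, 1), repeat=n)
        if all(not (a == 0 and b == 0) for a, b in zip(bits, bits[1:]))
    )

# Counts for n = 1, ..., 10.
counts = [count_no_consecutive_zeros(n) for n in range(1, 11)]
```

The first five counts are 2, 3, 5, 8, 13, which coincide with the $s = 1$ values of $W(n,1)$ computed by the recursion $W(n,1) = W(n-2,1) + W(n-1,1)$.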

Finding the regular feedback capacity $C_f$: This channel is not strongly irreducible, since the transition matrix $P_{S_i|S_{i-1},X=0}$ is not irreducible; hence, the stationarity of the optimal policy used by Chen and Berger [3] requires additional justification. By invoking theory on infinite-horizon average-reward dynamic programming we show that a stationary policy achieves the optimum of the DP and hence Eq. (50) holds.

The feedback capacity of the channel in Example 2 can be formulated according to [3] and [13] as:

\[ C = \lim_{N\to\infty} \frac{1}{N} \max_{\{P_{X_n|S_n}\}_{n=1}^{N}} \sum_{n=1}^{N} I(X_n;Y_n|S_n), \tag{58} \]

and this is equivalent to an infinite-horizon average-reward DP with finite state space and compact actions where: 

• the state of the DP is the state of the channel, i.e., $S_n$,

• the actions of the DP are the input distributions $p_0 \in [0,1]$ and $p_1 \in [0,1]$, where $p_0 = P_{X|S}(0|0)$, $p_1 = P_{X|S}(0|1)$,

• the reward at time $n$ given that the state of the DP is 0 or 1 is $I(X_n;Y_n|S_n = 0) = H_b(p_0 p) - p_0 H_b(p)$ or $I(X_n;Y_n|S_n = 1) = H_b(p_1)$, respectively,

• the transition probability given the actions $p_0$ and $p_1$ is $P_{S_n|S_{n-1}}(0|1) = p_1$ and $P_{S_n|S_{n-1}}(0|0) = p_0 p$.
Next, we claim that it is enough to consider the action $p_1 \in [\epsilon, 1]$ for some $\epsilon > 0$. First we note that for sufficiently small $\epsilon$,

\[ H(2\epsilon) > H(\epsilon) + \epsilon, \tag{59} \]

since $\frac{dH(x)}{dx} > 1$ for $x < \tfrac{1}{4}$.
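As a worked step, assuming $H$ denotes the binary entropy function in bits, the inequality (59) follows by integrating the derivative $\frac{dH(x)}{dx} = \log_2\frac{1-x}{x}$, which exceeds 1 exactly when $x < \tfrac{1}{3}$:

\[ H(2\epsilon) - H(\epsilon) = \int_{\epsilon}^{2\epsilon} \log_2\frac{1-x}{x}\,dx \;>\; \int_{\epsilon}^{2\epsilon} 1\,dx = \epsilon \qquad \text{whenever } 2\epsilon \le \tfrac{1}{3}. \]

So any $\epsilon$ with $2\epsilon$ inside the region where the derivative exceeds 1 satisfies (59).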



Next we show that it is never optimal to have an action $p_1$ small enough for (59) to apply. Let $J_n(0)$ and $J_n(1)$ be the maximum rewards to go in $n$ steps starting at state 0 and 1, respectively, and let us assume that the optimal action in state 1 is such a small $p_1^*$; then

\begin{align}
J_n(0) &\stackrel{(a)}{=} H(p_1^*) + (1-p_1^*)\,J_{n-1}(0) + p_1^*\,J_{n-1}(1) \nonumber \\
&\stackrel{(b)}{=} H(p_1^*) + (1-2p_1^*)\,J_{n-1}(0) + 2p_1^*\,J_{n-1}(1) + p_1^*\big(J_{n-1}(0) - J_{n-1}(1)\big) \nonumber \\
&\stackrel{(c)}{\le} H(p_1^*) + (1-2p_1^*)\,J_{n-1}(0) + 2p_1^*\,J_{n-1}(1) + p_1^* \nonumber \\
&\stackrel{(d)}{<} H(2p_1^*) + (1-2p_1^*)\,J_{n-1}(0) + 2p_1^*\,J_{n-1}(1), \tag{60}
\end{align}

where step (a) follows from the dynamic programming formulation; step (b) follows from the fact that we added and subtracted $p_1^*(J_{n-1}(0) - J_{n-1}(1))$; and step (c) follows from the fact that $J_{n-1}(0) - J_{n-1}(1) \le 1$; this is because we can choose $p_0 = 0$, which means that in one epoch time we can cause the state to change from 0 to 1 with probability 1, and the reward in one epoch time is always less than 1. Finally, step (d) follows from (59). Since step (d) corresponds to the action $2p_1^*$, it implies that an optimal policy would never include such a small action $p_1$.

Now we invoke [9, Theorem 4.5], which states that if the reward is a continuous function of the actions, and for any action the corresponding state chain is irreducible (unichain), then the optimal policy is stationary. Since the reward function is continuous in $p_0, p_1$ and since for any $p_0 \in [0,1]$, $p_1 \in [\epsilon, 1]$ the state process is irreducible, we conclude that the optimal policy $p_0^*, p_1^*$ is stationary (time-invariant), and therefore the capacity is given by (50).




Fig. 5. Capacity and zero-error capacity of the channel in Example 2 for different values of $p = \Pr\{Y = 0 \mid X = 0, S = 0\}$ (horizontal axis: $p$).



Now, using (50), we obtain that the regular feedback capacity as a function of $p$ is

\[ C_f(p) = \max_{p_0, p_1} \big( \pi_0 \left( H_b(p_0 p) - p_0 H_b(p) \right) + \pi_1 H_b(p_1) \big), \tag{61} \]

where $(\pi_0, \pi_1)$ are the equilibrium distributions given by $\pi_0 = \frac{p_1}{1 + p_1 - p_0 p}$ and $\pi_1 = 1 - \pi_0$. Fig. 5 shows a numerical evaluation of (61) as a function of $p$.
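A numerical evaluation of this kind can be sketched with a plain grid search. The code below is an illustrative sketch, not the authors' implementation: it assumes the rewards $H_b(p_0 p) - p_0 H_b(p)$ and $H_b(p_1)$, the transition probabilities $P(0|0) = p_0 p$ and $P(0|1) = p_1$, and the resulting equilibrium probability $\pi_0 = p_1 / (1 + p_1 - p_0 p)$; the names `hb` and `feedback_capacity_example2` and the grid size are hypothetical choices.

```python
import math

def hb(x):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def feedback_capacity_example2(p, grid=200):
    """Grid-search estimate of max over (p0, p1) of pi0 * r0 + pi1 * r1."""
    best = 0.0
    for i in range(grid + 1):
        p0 = i / grid
        r0 = hb(p0 * p) - p0 * hb(p)   # reward in state 0 (Z-channel)
        for j in range(1, grid + 1):   # p1 > 0 keeps the state chain irreducible
            p1 = j / grid
            pi0 = p1 / (1.0 + p1 - p0 * p)
            best = max(best, pi0 * r0 + (1.0 - pi0) * hb(p1))
    return best

c_at_0 = feedback_capacity_example2(0.0)
c_at_half = feedback_capacity_example2(0.5)
```

At $p = 0$ state 0 yields no reward, and the search reduces to $\max_{p_1} H_b(p_1)/(1+p_1)$, which equals $\log_2\frac{1+\sqrt{5}}{2}$; for larger $p$ the estimate can only increase because the policy $p_0 = 0$ remains available.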

Example 3 We consider here an example with three states with a ternary input and ternary output. The topology of the channel is depicted in Fig. 6. The channel conditional distribution $P(s', y|x, s)$ has the form $P(s', y|x, s) = P(s'|x, s)P(y|x, s)$, where state $s = 0$ is a perfect state, $s = 1$ is a good state and $s = 2$ is a bad state; the states 0, 1, 2 can transmit $\log 3$, 1 and 0 bits, respectively, with zero error probability.
We first evaluate the zero-error capacity numerically using the dynamic programming value iteration, i.e., Eq. (40), and then, using the numerical evaluation, we conjecture an analytical solution, which we verify via the Bellman equation.

Fig. 6. Channel topology of Example 3.



Evaluating $C_0$ using a value iteration algorithm: We calculated 50 iterations of the DP value iteration formula given in (34). The action space of player 1 is the stochastic matrix $P_{X|S}$, and we quantize each element of the stochastic matrix with a $10^{-4}$ resolution. Fig. 7 depicts the values of $\max_s J_n(s)$ and $\min_s J_n(s)$, which according to Theorem 5 are upper and lower bounds, respectively, on the zero-error capacity.

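After 50 iterations, we obtain that the first player's action $P_{X|S}$ (row $s$ lists $P_{X|S}(\cdot|s)$) is given by

\[ P_{X|S} = \begin{pmatrix} 0.4656 & 0.3177 & 0.2167 \\ 0 & 0.3177 & 0.6823 \\ 0 & 0 & 1 \end{pmatrix}, \tag{62} \]

and the reward $J_{50}(s) - J_{49}(s)$, which is an estimate of the zero-error capacity, is 1.10283 for all $s \in \{0, 1, 2\}$.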



Fig. 7. Upper bound, $\max_s J_n(s)$, and lower bound, $\min_s J_n(s)$, on the zero-error feedback capacity of the channel in Example 3. The value $J_{50}(s) - J_{49}(s) = 1.102$ is an estimate of $C_0$.



Analytical solution via Bellman equation: We conjecture that the optimal policy of Player 1 is a stochastic matrix of the form given in (62), i.e., $P_{X|S}(1|1) = P_{X|S}(1|0)$, and $P_{X|S}(0|1) = P_{X|S}(0|2) = P_{X|S}(1|2) = 0$. Based on these assumptions and the notation $a_0 = P_{X|S}(0|0)$ and $a_1 = P_{X|S}(1|0)$, the Bellman equation becomes:

\begin{align}
\rho + J(0) &= \max \min\{ -\log_2 a_0 + J(0),\; -\log_2 a_1 + J(1),\; -\log_2(1 - a_1 - a_0) + J(2) \}, \nonumber \\
\rho + J(1) &= \max \min\{ -\log_2 a_1 + J(2),\; -\log_2(1 - a_1) + J(0) \}, \tag{63} \\
\rho + J(2) &= J(0). \nonumber
\end{align}



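Using simple algebraic manipulation, we obtain that

\[ \rho = \log_2\frac{1-a_1}{a_1} = -\frac{1}{2}\log_2\big(a_1(1-a_1)\big), \tag{64} \]

which implies that $(1-a_1)^3 = a_1$, i.e., $a_1 = 1 + u - \frac{1}{3u}$, where $u = \sqrt[3]{-\frac{1}{2} + \sqrt{\frac{1}{4} + \frac{1}{27}}}$; hence $a_1 = 0.31767\ldots$ and

\[ C_0 = -2\log_2(1-a_1) = 1.102926\ldots. \tag{65} \]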
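The closed form can be cross-checked against the numerical policy. The sketch below is a hedged verification under the conjectured structure only: it assumes $a_1$ solves $(1-a_1)^3 = a_1$, sets $a_0 = (1-a_1)^2$, normalizes $J(0) = 0$, and checks that all terms inside each min of the Bellman equation are equal at this point. It does not re-run the maximization, and the names `solve_a1`, `state0_terms` and `state1_terms` are hypothetical.

```python
import math

def solve_a1(tol=1e-14):
    """Bisection for the root of (1 - a)^3 - a on [0, 1] (the function is decreasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (1.0 - mid) ** 3 - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a1 = solve_a1()
a0 = (1.0 - a1) ** 2
rho = -2.0 * math.log2(1.0 - a1)
J = {0: 0.0, 1: -rho / 2.0, 2: -rho}  # relative values, normalized so that J(0) = 0

# Closed form obtained from Cardano's formula for x^3 + x - 1 = 0 with x = 1 - a1.
u = (-0.5 + math.sqrt(0.25 + 1.0 / 27.0)) ** (1.0 / 3.0)
a1_closed_form = 1.0 + u - 1.0 / (3.0 * u)

# Terms inside the min for state 0 and state 1; each should equal rho + J(s).
state0_terms = [-math.log2(a0) + J[0],
                -math.log2(a1) + J[1],
                -math.log2(1.0 - a0 - a1) + J[2]]
state1_terms = [-math.log2(a1) + J[2],
                -math.log2(1.0 - a1) + J[0]]
```

The resulting values of `a0`, `a1` and `1 - a0 - a1` agree with the first row of the matrix found by value iteration to the displayed precision.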



IX. Conclusions 



We introduced a DP formulation for computing the zero-error feedback capacity for FSCs with state information 
at the decoder and encoder. The DP formulation, which can also be viewed as a stochastic game between two 
players, is a powerful tool that allows us to evaluate the zero-error feedback capacity numerically and, in many cases, as shown in the paper, to find an analytical solution via a fixed-point equation.